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We propose a refined iterative likelihood-maximization algorithm for reconstructing a quantum 
state from a set of tomographic measurements. The algorithm is characterized by a very high 
convergence rate and features a simple adaptive procedure that ensures likelihood increase in every 
iteration and convergence to the maximum-likelihood state. We apply the algorithm to homodyne 
tomography of optical states and quantum tomography of entangled spin states of trapped ions and 
investigate its convergence properties. 
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I. INTRODUCTION. 

Quantum tomography (QT) is a family of methods for 
reconstructing a state of a quantum system from a variety 
of measurements performed on many copies of the state. 
QT is of particular importance for quantum information 
processing, where it is used to evaluate the fidelity of 
quantum state preparation, capabilities of quantum in- 
formation processors, communication channels, and de- 
tectors. Theoretically proposed in [l[ and first exper- 
imentally implemented in the early 1990s 0, QT has 
become a standard tool in many branches of quantum 
information technology. 

Aside from the experimental procedure of conducting a 
set of tomographically complete measurements on a sys- 
tem, QT requires a numerical algorithm for extracting 
complete information about the state in question from 
the measurement results. From a variety of algorithms 
proposed, two main approaches have become popular 
among experimentalists. One approach is based on linear 
inversion: because the statistics of the measurement re- 
sults is a linear function of the density matrix, the latter 
can be obtained from the former by solving a system of 
linear equations. Examples are the inverse Radon trans- 
formation Q or the quantum state sampling method 0] , 
that were almost exclusively used in optical homodyne 
tomography until recently. 

The second approach is maximum-likelihood (MaxLik) 
quantum state reconstruction, which aims to find, among 
all possible density matrices, the one which maximizes 
the probability of obtaining the given experimental data 
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set [5j. To date, the maximum-likelihood approach has 
been applied to various quantum problems from quan- 
tum phase estimation [3] to reconstruction of entangled 
optical states @, Q . 

MaxLik reconstruction has several advantages with re- 
spect to linear inversion. First, with linear inversion, 
statistical and systematic errors of the quantum mea- 
surements are transferred directly to the density matrix, 
which may result in unphysical artifacts such as nega- 
tive diagonal elements. Second, MaxLik allows one to 
incorporate additional information that may be known 
about the density matrix into the reconstruction proce- 
dure. Third, experimental imperfections (such as detec- 
tor inefficiencies) can be directly incorporated in to the 
MaxLik reconstruction procedure. 

One approach to quantum MaxLik reconstruction is to 
express the density matrix as a function of a set of inde- 
pendent parameters, in a way that upholds the positiv- 
ity and unity-trace constraints for all parameter values. 
Then one can apply any iterative optimization method 
to find the set of parameter values that maximize the 
likelihood. Because the log likelihood function for QT 
is convex, the optimization problem is well behaved and 
most iterative optimization methods are guaranteed to 
converge to the unique solution. This approach was used 
by James et al. in their work on tomography of optical 
qubits In application to homodyne tomography, the 
method was elaborated by Banaszek et al. [Hj and used 
in an experiment by DAngelo et al. (Tl| . 

Generic numerical optimization methods are often slow 
when the number of parameters (the square of the Hilbert 
space dimension) is large. An alternative algorithm de- 
scribed below, which takes advantage of the structure of 
the MaxLik reconstruction problem and has good conver- 
gence properties was proposed by [12J and later adapted 
to different physical systems such as external degrees of 
freedom of a photon [l3| and the optical harmonic oscilla- 
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tor [Tj]. Thanks to its good properties, this method has 
been widely used in recent experiments on optical homo- 
dyne tomography of both single- and multimode optical 
states [l5j . Despite its success, no argument guarantee- 
ing monotonic increase of the likelihood in every itera- 
tion step has been presented. Although to our knowledge 
the experimental practice has not yet faced a counterex- 
ample, theoretically such counterexamples do exist and 
there remains a risk that the algorithm could fail for a 
particular experiment. 

In this paper, we propose an iteration which depends 
on a single parameter e that determines the "length" of 
the step in the parameter space. For e — * oo, the itera- 
tion becomes that of Ref. [HI, [3| • On the other hand, 
we prove that the likelihood will increase in every itera- 
tion step for e — * 0. We thus obtain a simple adaptive 
procedure, which, by choice of parameter e, allows us to 
find a compromise between the convergence rate and the 
guarantee on the likelihood increase. 



II. THE NONLINEAR ITERATIVE 
ALGORITHM. 

We now describe the iterative scheme used in Refs. 
[lH , Il4j |. A generic tomographic measurement is de- 
scribed by a positive-operator- valued measure (POVM), 
with the outcome of the j'th measurement associated 
with a specific positive operator II j > 0, with ^2j Hj 
normalized to the identity operator. In the case of sharp 
von Neumann measurements, H, is a projection operator. 

Let N be the total number of measured quantum 
systems and fj be the number of occurrences for each 
measurement result H,. The likelihood of a particu- 
lar data set {fj} for the quantum state p is given by 

C{p) =ILpr?, with 



pr ■ = Tr[Ujp] 



(1) 



being the probability of each outcome. 

Our goal is to find the density matrix p which maxi- 
mizes the log-likelihood 



log£(p) =^/ J -log(pr i ). 



(2) 



As was shown in Ref. a state po that maximizes the 
likelihood ^ obeys a simple nonlinear extremal equation 

R(po)po = PoR(po) = Po, (3) 
where we introduced the state dependent operator 



Rfp) = lyin, 

Kf> N ^ pr, 3 

7 J 



(4) 



Note that R(p) is a non-negative operator. Following Ref. 
[l6| , where a similar method was proposed to estimate an 



unknown quantum measurement, Eq. ([3]) can be stated 
in a slightly different but equivalent form 



R{po)p R(p ) = p . 



(5) 



For simplicity, we assume that the measurements are suf- 
ficient to ensure that there is a unique maximum likeli- 
hood state po- 
rn the case where the density matrix p is restricted 
to matrices that are diagonal, the problem of finding 
a solution to Eq. © can be solved by the well-known 
expectation-maximization algorithm Q. If R is always 
diagonal in the same basis, expectation-maximization re- 
duces to computing the next iterate according to p( k+1 ^ = 
R(p( k ))p( k ) in the hope of converging to a fixed point 
that necessarily satisfies Eq. ([3]). The expectation- 
maximization algorithm is guaranteed to increase the 
likelihood at every iteration step. However, this itera- 
tion cannot be used for the quantum problem because 
without the diagonal restriction, it does not preserve the 
positivity of the density matrix. A possible remedy is 
to apply the expectation-maximization iteration to the 
diagonalized density matrix followed by a unitary trans- 
formation of the density matrix eigenbasis (H. Il7|. 

Refs. [H, [3] instead propose to base the iterative 
algorithm on Eq. $5}). We choose an initial density matrix 
such as p^ ' = Af[l] (which avoids any initial problems 
with zero pr •), and compute the next iterate from 



p( fe ) using 



P^ =N\k{p {k) )p {k) R{p {k) ) 



(6) 



where Af denotes normalization to trace 1 and the posi- 
tivity of the density matrix is explicitly preserved in each 
step. Hereafter we refer to scheme of Eq. ^ as the u RpR 
algorithm" . 

Despite the RpR algorithm being a quantum gen- 
eralization of the well-behaving classical expectation- 
maximization algorithm, its convergence is not guaran- 
teed in general. This is evidenced by the following coun- 
terexample. Assume that we made three measurements 
on a qubit with a single apparatus with Ho = 1 0) (0 1 , TIi = 
|1) (1|, detecting |0) once and |1) twice. The measurement 
is tomographically incomplete because no information is 
gained about the off-diagonal elements of the density ma- 
trix. From Eq. (0J), we find R = (n /p o + 2ili/pii)/3. 
Using the uniformly mixed p^ = IIo/2 + IIi/2 as a start- 
ing point, we obtain, in step 1: 

i?=~n + ^n i; p w =n /5 + 4fl 1 /5; (7) 



and in step 2: 



fl=|Do + |ni; 
3 6 



p^ =p(0). 



(8) 



The iterations produce a cycle of length two. The second 
step strictly decreases the likelihood. 
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III. THE "DILUTED" ITERATIVE 
ALGORITHM. 

To improve the convergence of the RpR iteration let us 
modify it along the lines used for calculating the mutual 
entropy of entanglement in [isj |. namely by mixing the 
generator of the nonlinear map ([5]) with a unity operator 



1 + eR 
1 + e 



P 



(*): 



eR 



1 



(9) 



where e is a positive number. Loosely speaking, the non- 
linear map is diluted and the iteration step is controlled 
by e. Now let us prove that using the modified algorithm 
©, the likelihood is increased in each step if e <C 1 is 
sufficiently small. 

In the linear approximation with respect to e, we can 
rewrite Eq. ([5]) as 



with 



p {k+1) = p {k) + Ap (10) 



Ap = e(RpW +p(*>A-2pW). (11) 



To obtain Eqs. (fit)]) and (jXTJ) , we approximated (1 
e)~ 2 ks 1 — 2e and used the relation 



Tr(Rp) = Tr(pR) = 1, 



(12) 



which is a consequence of the definition (0| of R. The 
normalization factor TV is 1 to first order in e. 

We now evaluate the likelihood associated with the new 
state p( k+1 *> and compare it to that of p^ k \ neglecting 
terms of second and higher order in e: 

\ogC(p^) = ^/,logTr(n i p( fe + 1 )) (13) 
= ^/ J log[ P r J .+Tr(fl i Ap)] 



= log£(pW) + V — Tr(ILjA/3) 

= log£(pW)+Tr(i?Ap) 

= log£(pW) + 2e[Tr(i? j o(^i?) - 1] 



In the second equality above, we used Eq. {U; m the 
fourth, the definition ^ of the likelihood and the ap- 
proximation log(l + a) ss a for a <C 1; and in the sixth, 
the cyclic property of the trace and Eqs. (jlip and (fT2]) . 
We complete the proof by showing that 



Tr(RpR) = Tr(RpR)Trp > Tr 2 (Rp) = 1. 



(14) 



Indeed, the positive density matrix has a 
positive square root p = (p 1 / 2 ) 2 , and thus 



Tr(RpR)Trp = (i?p 1/2 , Rp l l 2 ){p 1 ' 2 , p 1 ' 2 ) and 
Tr 2 (Rp) = KRp 1 / 2 ,/? 1 / 2 )] 2 , where the scalar product of 
matrices is defined as (A,B) = Y,i,j A *j B ji = Tr(itfl). 
Consequently the Cauchy-Schwarz inequality can be 
applied to yield the inequality in Eq. (fT4"|) . 

We have thus proven that under the application of iter- 
ations ([9]) , the likelihood is non-decreasing provided that 
e is chosen sufficiently small in every step. Suppose we 
find a density matrix p such that there is no e > that 
yields a proper increase in the likelihood when the itera- 
tion (J9j) is applied. Then 



Tr(RpR) = 1. 



(15) 



According to Eqs. (JT2J) , (JX4J) , and the equality condition 
in the Cauchy-Schwarz inequality, Eq. (fT5j) can be ful- 
filled if and only if Rp 1 / 2 = p 1 / 2 or, equivalently, Rp = p. 
The latter equality characterizes the maximum likelihood 
state, so p = po- 

Proper use of the diluted iterations requires a strategy 
for choosing e at each step. Asymptotic convergence of 
the diluted iterations may depend on this strategy. As 
we show in the Appendix, one strategy that converges 
to po is to choose the e which maximizes the likelihood 
increase in every iteration. However, this strategy is com- 
putationally expensive because it requires solving a one- 
dimensional optimization problem. 

One possible alternative approach is as follows. 

• begin with the RpR ([6]) iterations, which are identi- 
cal to the diluted iterations © with e — > oo. Verify 
that the likelihood increases in each step. 

• In the event the likelihood does not increase and 
before terminating the iterations, use the diluted 
iteration ([9]), trying smaller values of e to determine 
whether significant increases in likelihood are still 
possible. If so, continue the iterations as needed 
with these smaller values of e. 

• When the iterations appear to have converged or 
stagnated, find the value of e at which the likeli- 
hood increase is maximized and attempt additional 
iterations using this value. If the likelihood and/or 
the density matrix does not exhibit significant fur- 
ther changes, one can be sure the iteration sequence 
has converged to the maximum-likelihood solution. 

Another approach is to choose e randomly according to 
a distribution with nonzero density in a neighborhood of 
0. To ensure non-decreasing likelihood, each iteration re- 
quires repeatedly choosing e randomly until one is found 
for which the likelihood increases. The argument in the 
Appendix can be expanded to show that if e is chosen 
in this way, then the iteration has a non-zero probabil- 
ity of escaping from any non-maximum likelihood density 
matrix. 

We note again that in all practical cases studied so 
far the RpR algorithm exhibited good convergence and 
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monotonic increase of the likelihood. The diluted itera- 
tion may become necessary for low-dimensional systems 
where the nonlinear RpR iteration may "overshoot". 
Characterizing the situations where this can happen is 
an open problem. 

Finally, let us mention that in some tomography 
schemes, one or more POVM elements (measurement 
channels) IF, are not accessible and, consequently, G = 
Y]- Hj may not be normalizable to the unity operator 
on the reconstruction subspace. Then the extremal map 
© should be replaced by G^ 1 R^^poRiPo)^^ 1 = po to 
avoid biased results, see e.g. [8]. Obviously, the corre- 
sponding iterative procedure can be diluted in a similar 
way as was done with the original RpR algorithm. 



IV. EXAMPLES 

First, consider the counterexample discussed above. A 
simple numerical test shows that replacing the RpR it- 
eration by ^ warrants convergence for any finite e; the 
likelihood monotonically increases for e < 25.7. 

Second, we studied the dataset of 14,153 points ob- 
tained in the experiment on homodyne tomography of 
the coherent superposition of the vacuum and the single- 
photon Fock state [19j. This is the same dataset as 
that analyzed in Ref. [lj]. This reference discusses the 
specifics of application of the likelihood-maximization 
procedure to continuous-variable measurements. We 
studied the dependence of the convergence speed on the 
parameter e. 

The Hilbert space was restricted to 14 photons. We 
first ran the iterations for a very long time until the den- 
sity matrix and the likelihood no longer changed. In this 
way, we obtained the density matrix p that maximizes 
the likelihood for this dataset with high accuracy (limited 
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FIG. 1: Iterations required for homodyne tomography data. 
The number of iterations required for convergence as a func- 
tion of e. Results for three tolerances are shown: 10 -3 (a), 
10 -5 (b) and 10 -7 (c). The rightmost column represents the 
RpR algorithm (e — > oo). The dataset is the same as that 
used in Ref. 14,153 quadrature samples of the state ap- 

proximating a coherent superposition of the vacuum and the 
single-photon states. 



by the floating point representation). 

We then re-initialized the density matrix and ran the 
diluted RpR algorithm with various values of e. Wc 
repeated the iterations until the pairwisc difference be- 
tween all matrix elements of p^ and po was below a 
pre-selected tolerance for each matrix element. Three 
tolerance values were investigated: 10 -3 , 10~ 5 and 10~ 7 . 
The numerical experiment was conducted on a 2. 8- GHz 
Pentium 4 computer [2l|. The code was written in Del- 
phi 2l|. Each iteration took about 0.3 s. 

The result of this experiment is shown in Fig. [TJ The 
RpR algorithm showed monotonic likelihood increase and 
converged to the set tolerances within 15, 30, and 49 iter- 
ations, respectively. The convergence rate of the diluted 
algorithm improves with increasing e and approached 
that of the RpR algorithm for large values of e. One 
sign of systematic overshoot of the RpR algorithm would 
be a minimum in the three curves of Fig. [T] at e < oo. 
We did not observe such an effect. 

Third, we considered a dataset consisting of 21,832 in- 
dividual experiments with four ion qubits. The goal of 
the experiment was to purify one entangled pair of ions 
from two [2(j. In order to determine the fidelity of the 
purified pair and for the purpose of checking that the 
experiment did not introduce spurious entanglement, in- 
complete state tomography was used. Specifically, each 
tomographic measurement involved first determining the 
number of qubits in state |0) among the first pair of 
qubits and then performing a pair of 7r/2 pulses at vari- 
ous phases on the second pair of qubits (the purified pair) 
before determining its number of ions in state |0). Repe- 
tition of these combinations of pulses and measurements 
suffices for determining the fidelity of the purified entan- 
gled state. The actual measurements involve counting 
the number of photons scattered from an ion pair. This 
number has a Poissonian distribution whose mean de- 
pends on the number of ions in state |0). Thus, each 
experiment results in two counts, one from each mea- 
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FIG. 2: Required iterations for ion-qubit tomography data. 
See the caption of Fig. [TJ for an explanation of the axes and 
interpretation of the curves. The tolerances used are 10 -2 
(a), 10" 3 (b), 10" 4 (c) and 10" 7 (d). The underlying data 
was used in the analysis for [2p( ] . 



5 



surement. Every combination of counts can be associated 
with a measurement operator of a POVM that also de- 
pends on the phases in the pair of ir/2 pulses. Although 
we cannot determine the density matrix of the complete 
four-qubit state p with these measurements, there is suf- 
ficient information to deduce the density matrix p' ob- 
tained from p by phase decohering the first pair of qubits 
in the logical basis and then symmetrizing each pair of 
qubits. The symmetrization process is equivalent to ran- 
domly switching the qubits in each pair. The diluted 
RpR iteration with the appropriate POVMs preserves 
the decohered and symmetrized form of density matri- 
ces. Starting from the completely mixed initial state, it 
converges to the maximum likelihood solution for p 1 . The 
code for the ion-qubit tomography was written in R [22j 
and required about .3 s per iteration on a 1.6 GHz Pen- 
tium 4 laptop. The behavior of the iterations is shown 
in Fig. [2] and is similar to the behavior of the iterations 
for the homodyne tomography shown in Fig. [1] Again, 
no sign of overshoot was detected in these curves. 

In summary, we have proposed an iterative likelihood- 
maximization procedure for quantum tomography, which 
is applicable when the RpR iteration does not monoton- 
ically increase the likelihood. We have found the suf- 
ficient condition under which the iterations converge to 
the maximum-likelihood solution. The new algorithm 
has been tested on two sets of experimental data. 

Acknowledgements 

We thank J. Fiurasek for helpful discussions, D. 
Leibfried for use of the ion trap data and S. Glancy and T. 
Gerrits for their help in reviewing the paper. This work 
was supported by NSERC, CFI, AIF, Qu&ntumWorks 
and CIAR (A.L.); by the Czech Ministry of Educa- 
tion, Project MSM6198959213, Czech Grant Agency, 
Grant 202/06/307 and the European Union project CO- 
VAQIAL FP6- 511004 (J.R and Z. H.). Contributions to 
this work by NIST, an agency of the US government, are 
not subject to copyright laws. 



APPENDIX A: CONVERGENCE OF DILUTED 
ITERATIONS 

Consider the diluted iterations ([5]). For any fixed e, 
we cannot exclude the possibility that the iteration stag- 



nates, even if the likelihoods continue to increase. The 
problem is that the direction of change in p for small 
e, which is computed as a fixed function of p, may dif- 
fer substantially from the direction of steepest ascent. 
Although the likelihood is guaranteed to increase for suf- 
ficiently small e, the direction of change could become 
increasingly parallel to surfaces of constant likelihood, 
thus leading to an iteration that never reaches the max- 
imum likelihood solution pQ. Alternatively, if e is held 
fixed, the iterations could converge into a limit cycle or 
a more complicated limit set, thus avoiding pq. 



Here we show that if at each step, e is chosen to max- 
imize the likelihood increase, the iterations converge to 
Pq in the limit. To sec this, wc first notice that R{p) is 
continuous as a function of p on the set S of density ma- 
trices for which the likelihood is not 0. This is because, 
if p € S, then pr^ > for all j with fj > 0. It follows 

that the iterate / defined by Eq. (J9]) is also a continuous 
function of the density matrix and e > 0. 

The initial density matrix p^ (the completely mixed 
state) is in S because, for any j, pij(p^) = Trpj^ )] oc 
Tr[II_y] > 0. The choice of e guarantees that the likelihood 
is non-decreasing, so each subsequent iterate p^ must be 
in S as well. The sequence p^ is bounded and must thus 
have at least one limit point pi , which also belongs to the 
interior of S. 



Suppose that pi ^ pq. As we showed in the text, this 
implies that the likelihood of I(pi, e) strictly increases for 
sufficiently small e. In particular, there is a 5 > and an 
e, such that the likelihood increase at pi is at least 5. Be- 
cause the likelihood increase is also a continuous function 
of p and e on a neighborhood of pi, there is a (possibly 
smaller) neighborhood S$ in which the maximum likeli- 
hood increase exceeds 5/2. Because p\ is a limit point, 
one can choose an iterate fft*> in S$ so that its likelihood 
is within (say) 5/4 of that of pi. Then the next iterate's 
likelihood exceeds that of pi by at least 5/4. Since the 
likelihood is non-decreasing, and by continuity, future it- 
erates cannot have pi as a limit point, contradicting the 
assumption on pi. We conclude that pi — po, as desired. 
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